Effect of Dynamic Time Warping Based Alignment on the Accuracy of the Transformation Function for Voice Conversion

Author

  • Radhika Khanna
Abstract

Voice conversion transforms the speaker characteristics of speech uttered by one speaker, called the source speaker, to generate speech with the voice characteristics of a desired speaker, called the target speaker. Voice conversion is used in many applications, including dubbing, speech quality enhancement, text-to-speech synthesizers, online games, multimedia, music, cross-language speaker conversion, restoration of old audio tapes, cellular applications, and low bit-rate speech coding. Various models are used for voice conversion, such as the Hidden Markov Model (HMM), Artificial Neural Network (ANN), Dynamic Time Warping (DTW), and Vector Quantization (VQ). The quality and the identity conveyed by the transformed speech depend on the accuracy of the transformation function derived from the given training data. Estimating the transformation function requires properly aligned passages spoken by the source and target speakers. Exact alignment of the corresponding speech units in the source and target passages is mandatory for accurate estimation of the transformation function, because the durations of speech units (i.e. phonemes or sub-phonemes) may have quite different distributions among speakers. Generally, DTW and VQ are used for this purpose. The objective of this paper is to compare the effectiveness of DTW-based and VQ-based estimation of the transformation function. The analysis of the results shows that DTW provides about five percent more reduction in the transformed-target distances of the speech, which indicates that the DTW-based technique is relatively better for estimating the transformation function.
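The alignment step described in the abstract can be illustrated with a minimal DTW sketch. This is not the paper's implementation: the function name `dtw_align`, the Euclidean frame distance, and the unconstrained three-way step pattern are all assumptions chosen for illustration, and the frames are toy one-dimensional values rather than real spectral features.

```python
import math


def dtw_align(source, target):
    """Align two sequences of feature vectors with classic DTW.

    Hypothetical helper for illustration only. Returns (total_cost, path),
    where path is a list of (source_index, target_index) pairs.
    """
    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning source[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local distance: Euclidean distance between frames
            d = math.dist(source[i - 1], target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # source frame advances alone
                                 cost[i][j - 1],      # target frame advances alone
                                 cost[i - 1][j - 1])  # both advance together

    # Backtrack from the end to recover the warping path
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Toy example: the target holds its first "frame" one step longer than the source
source = [(0.0,), (1.0,), (2.0,)]
target = [(0.0,), (0.0,), (1.0,), (2.0,)]
total_cost, path = dtw_align(source, target)
print(total_cost, path)
```

Each (source_index, target_index) pair on the returned path marks frames treated as corresponding speech units, so the pairs could serve as training data for a transformation function even when the two speakers differ in unit durations.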

Similar resources

To Investigate the Accuracy of the Dynamic Time Warping Based Transformation Function for Voice Conversion

Voice conversion transforms the speaker characteristics of speech uttered by one speaker, called the source speaker, so as to generate speech with the voice characteristics of a desired speaker, called the target speaker. Voice conversion technology is used in many applications, including dubbing, speech quality enhancement, text-to-speech synthesizers, online games, multimedia, music, c...

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article examines methods for optimizing the performance of GMM-based voice conversion systems, taking the GMM method as the baseline to be improved. In current methods, because a single conversion function is used to convert all speech units, and because of the subsequent spectral smoothing arising from statistical averaging, we observe quality ...

Optimal Current Meter Placement for Accurate Fault Location Purpose using Dynamic Time Warping

This paper presents a fault location technique for transmission lines with minimum current measurement. The algorithm investigates proper current ratios for the fault location problem, based on Thevenin theory in faulty power networks and on the calculation of short-circuit currents in each branch. These current ratios are extracted regarding lowest sensitivity on Thevenin impedance variations of the netw...

On the impact of alignment on voice conversion performance

Most of the current voice conversion systems model the joint density of source and target features using a Gaussian mixture model. An inherent property of this approach is that the source and target features have to be properly aligned for the training. It is intuitively clear that the accuracy of the alignment has some effect on the conversion quality but this issue has not been thoroughly stu...

Designing a New Non-Parallel Training Method for Voice Conversion with Better Performance than Parallel Training

Introduction: Voice mimicking by computers has been one of the most challenging topics of speech processing in recent years. A voice conversion system has two sides. On one side is the source speaker, whose voice is changed to mimic the voice of the target speaker (who is on the other side). Two methods of p...

Publication date: 2013